Enhancing tagging performance by combining knowledge sources

Author

  • Lars Borin
Abstract

This paper describes an ongoing effort to combine existing natural language processing (NLP) resources in order to reach part-of-speech (POS) tagging performance beyond what any single resource can provide. The context of the effort is the ETAP project, a parallel translation corpus project funded by the Bank of Sweden Tercentenary Foundation. The aim of the project is to create an annotated and aligned multilingual translation corpus, which will serve as the basis for developing methods and tools for the automatic extraction of translation equivalents for applications such as machine translation systems. To this end, we are investigating to what extent it is possible to reuse existing NLP resources (either developed in our department in some other context or freely available on the WWW) for the task of tagging the languages of the project. As a general rule, the number of such resources is growing quite fast at present. On the other hand, their availability is highly dependent on the language, from almost unlimited numbers for English,

Similar resources

Combining Knowledge Sources For Automatic Semantic Tagging

In this working session, we will discuss methods which could plausibly be used for combining evidence for assigning semantic tags to words in a text. We will discuss methods that apply at knowledge acquisition time to produce a single static knowledge source to be used by a single, complete, semantic tagger, as well as methods for dynamically combining outputs of a set of independent, possibly ...

An evaluation of enhancing social tagging with a knowledge organization system

Traditional subject indexing and classification are considered infeasible in many digital collections. Automated means and social tagging are often suggested as the two possible solutions. Both, however, have disadvantages and, depending on the purpose of use or context, require additional manual input. This study investigates ways of enhancing social tagging via knowledge organization systems,...

Word Sense Disambiguation using Optimised Combinations of Knowledge Sources

Word sense disambiguation algorithms, with few exceptions, have made use of only one lexical knowledge source. We describe a system which performs unrestricted word sense disambiguation (on all content words in free text) by combining different knowledge sources: semantic preferences, dictionary definitions and subject/domain codes along with part-of-speech tags. The usefulness of these sources...

Investigating the Use of Paratactic and Hypotactic Conjunctions among Iranian Pre-university Students

In an attempt to dispel the persisting fallacy that an individual’s grammar knowledge is indicative of the way they put this knowledge into practice, this study seeks to highlight the inconsistency which resides between one’s competence and performance in the domain of conjunctions. It aims to shed light on the discrepancy which lies between the knowledge and production of conjunctions. The res...

Old Swedish Part-of-Speech Tagging between Variation and External Knowledge

We present results on part-of-speech and morphological tagging for Old Swedish (1225–1526). In a set of experiments we look at the difference between within-corpus and across-corpus accuracy, and explore ways of mitigating the effects of variation and data sparseness by adding different types of dictionary information. Combining several methods, together with a simple approach to handle spelling...

Publication date: 1999